2 Genotype, Phenotype, and Environment
Fig. 2.1 The relation among genes, mRNA, proteins, and metabolites. The curved arrows in the
upper half of the diagram denote regulatory processes
Table 2.1 Approximate numbers (variety) of different objects in the human body
Object                 Number
Genes                  30 000
mRNA                   10⁵
Proteinsᵃ              3 × 10⁵
Expressed proteinsᵇ    10³–10⁴
Cell types             220
Cellsᶜ                 10¹³–10¹⁴

ᵃ Potential repertoire
ᵇ In a given cell type
ᶜ Excluding microbial cells hosted within the body and which may be comparably numerous
The bioinformatics landscape was dramatically transformed by the availability
of whole genomes and, at roughly the same time (although there was no especial
connexion between the developments), whole proteomes and whole metabolomes.
Far wider-ranging comparisons could now be carried out; in particular, a global vision
of regulation seemed to be within grasp. Part III focuses on these developments;
Table 2.1 recalls the magnitude, at the level of the raw materials, of the problems to
be solved.
Genomics is concerned with the analysis of gene sequences, and there are two
main territories of this work: (1) comparison of gene sequences, that is, analysis of
the relation of a given sequence with other sequences (external correlations); and
(2) analysis of the succession of symbols in sequences (internal correlations). The
first attempts to elucidate the function of sequences whose function is unknown
were made by comparing the “unknown” sequence with sequences of known function. This
approach is based on the principles that similar sequences encode similar protein structures,
and similar structures encode similar functions (there are, however, many examples
for which these principles do not hold). One also compares sequences known to